-
Notifications
You must be signed in to change notification settings - Fork 38
[gce_testing] Create findMatchingLogs to return all found logs and create QueryAllLogs.
#485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Why not just leave |
hasMatchingLog and QueryLog.hasMatchingLog to return all found logs and create QueryAllLogs.
Thank you for the comment @jefferbrecht ! I was going back and forth between the possibilities of how to expose the "found log count" while minimizing the API updates. The last update returns all found logs in |
| } | ||
| matchingLogs = append(matchingLogs, entry) | ||
| } | ||
| if found && len(matchingLogs) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: append here is guaranteed to result in a non-empty slice, and we set found at the same time as append, so we no longer need the found variable -- just test on len(matchingLogs) > 0. We can also eliminate the returned bool: if the query turns up empty then we'd signal "logs not found" by returning an empty slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Renamed hasMatchingLogs to findMatchingLogs to represent more accurately the new implementation.
| @@ -666,10 +667,36 @@ | |||
| // found after some retries. | |||
| func QueryLog(ctx context.Context, logger *log.Logger, vm *VM, logNameRegex string, window time.Duration, query string, maxAttempts int) (*cloudlogging.Entry, error) { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to implement QueryLog by just calling QueryAllLogs internally and returning the first element of the slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
| // over the trailing time interval specified by the given window. | ||
| // Returns all the log entries found, or an error if the log could not be | ||
| // found after some retries. | ||
| func QueryAllLogs(ctx context.Context, logger *log.Logger, vm *VM, logNameRegex string, window time.Duration, query string, maxAttempts int) ([]*cloudlogging.Entry, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QueryAllLogs arguably does not need the "retry until at least one log is found" logic (although the "retry on retriable failure" logic should stay). IMO that's only there for "get me the first matching log ASAP" (which is just QueryLog), whereas here the caller is more likely to be looking for "get me all matching logs from the given time window" (after they've waited an appropriate amount of time for the eventual consistency of the backend).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how to remove "retry until at least one log is found" from QueryAllLogs without having some downsides.
@jefferbrecht Could you add more details in the expected solution here ?
IMO that's only there for "get me the first matching log ASAP"
I think it could also be thought as "return query result if you find any data". It has the additional value of shorter tests.
QueryAllLogsarguably does not need the "retry until at least one log is found" logic (although the "retry on retriable failure" logic should stay).
Yes, "get me all matching logs from the given time window" maybe the most likely use case of QueryAllLogs, but i don't see an alternative way of "stopping the lookup" that is more helpful for testing.
Alternatives :
- [Current] Stops querying the backend if any data is found (
len(matchingLogs) > 0).- Shorter tests.
- Caller is expecting to see "all found data".
- Stop when
err == nil- Shorter tests. May result in "no data found".
- No consideration for race conditions when log may take longer to be ingested.
- Create an
expectedLogCountand stop whenlen(matchingLogs) >= expectedLogCount.- Same as the previous one, but more restrictive.
- Remove the "stop after data is found" will result in the always querying "maxAttempts".
- The tests is long (
maxAttempt * backoffTime) and if the time window is small (e.g. 1 minute), the last query could return less data after backoff retries.
- The tests is long (
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further thought, i still think stopping the query at len(matchingLogs) > 0 is the best option since its the most aligned with how the other QueryLog and WaitForLog functions have the expectation of finding data (at least one log).
To expectzero logs we can use AssertLogMissing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "stop as soon as we find one log" optimization is something you can only do when the caller is interested in finding at least one log. You cannot do that optimization in all other scenarios.
Consider the use cases we have today:
- I wrote exactly one log to the Ops Agent, and I expect it to show up in the backend eventually, and I don't care whether it is ingested multiple times -- most of our tests do this
- I wrote exactly one log to the Ops Agent, and I expect it to show up in the backend eventually, and I do care whether it is ingested multiple times -- this is the test we're talking about here
- I wrote exactly one log to the Ops Agent, and I've configured the Ops Agent to drop that log, so I expect that log to never show up in the backend -- this would be the
exclude_logstests
For (1), you are less bound by eventual consistency because you don't care what happens after you find the log you sent, so you're able to aggressively optimize the check by querying many times in quick succession until you find it.
For (2), it would be an error to stop at the first log because you can't assume that all logs will appear in the backend at the same time. It is possible -- even moreso for logs sent in different batches like we're doing -- that the second log could appear much later than the first. So the caller is required to wait an appropriate amount of time first for eventual consistency and then query for all of the logs. And at that point there's zero value in looping until you find a log: the assumption is that the backend is now fully consistent, because we already waited for it, so the log is either there or it isn't. If the query still returns zero logs after waiting for eventual consistency then I want the test to fail, because that means I need to fix my test (or the Ops Agent is actually broken).
(3) is simpler relative of (2): you need to wait for eventual consistency and then check that there are no logs. So you could reimplement AssertLogMissing in terms of QueryAllLogs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed explanation! I see now the value of returning the successful query data in (2) and (3). Also, this way (3) and (2) are very similar indeed.
Result : I implemented the following :
- (1) : Implemented of
QueryLogthat returns the first found log. - (2) : Implemented
QueryAllLogsthat returns the first successful query and does retries only for "retriable" errors. - (3) : Implemented
AssertLogMissingusingQueryAllLogs.
hasMatchingLog to return all found logs and create QueryAllLogs. findMatchingLogs to return all found logs and create QueryAllLogs.
32b459e to
fb6c69a
Compare
…found logs in the backend from a log query.
…Refactor `QueryLog` to use `QueryAllLogs`.
695408d to
4201281
Compare
Update
hasMatchingLogto return all found logs and createQueryAllLogs.Context :
file offsetstorage to Otel Logging receivers. ops-agent#2166, we need to able to know how many logs did a query returned.